Reinforcement Learning (RL)
https://www.youtube.com/watch?v=0MNVhXEX9to&list=PLMrJAkhIeNNQe1JXNvaFvURxGY4gE9k74&index=1
https://speakerdeck.com/imai_eruel/reinforcement-learning-for-everyone
Model-Based Reinforcement Learning (Model-Based RL)
Dynamic Programming (DP)
Bellman equation
Value iteration
Policy iteration
Model-Free Reinforcement Learning (Model-Free RL)
Temporal Difference error (TD error)
SARSA
on-policy
Q-learning
off-policy
Actor-Critic method
Dopamine
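
As a rough illustration of the TD error, SARSA (on-policy), and Q-learning (off-policy) items in the outline above, here is a minimal tabular sketch. It is not taken from the linked video or slides. The function names, the tiny 2-state Q-table, and the values of `gamma` and `alpha` are all hypothetical, chosen only to show that the two methods differ in which next-state value they bootstrap from.

```python
# Minimal sketch (assumed setup): Q is a table indexed as Q[state][action].

def sarsa_td_error(Q, s, a, r, s_next, a_next, gamma=0.9):
    """On-policy TD error: bootstrap from the action actually taken next."""
    return r + gamma * Q[s_next][a_next] - Q[s][a]

def q_learning_td_error(Q, s, a, r, s_next, gamma=0.9):
    """Off-policy TD error: bootstrap from the greedy action in s_next."""
    return r + gamma * max(Q[s_next]) - Q[s][a]

# Hypothetical Q-table: 2 states x 2 actions.
Q = [[0.0, 0.0],
     [1.0, 3.0]]

# One transition: state 0, action 0, reward 1.0, next state 1.
# Suppose the behavior policy then picks action 0 in state 1.
delta_sarsa = sarsa_td_error(Q, s=0, a=0, r=1.0, s_next=1, a_next=0)  # 1 + 0.9*1 - 0
delta_q = q_learning_td_error(Q, s=0, a=0, r=1.0, s_next=1)           # 1 + 0.9*3 - 0

# Either error can drive the same update rule: Q(s,a) += alpha * delta.
alpha = 0.5  # assumed learning rate
Q[0][0] += alpha * delta_q
```

In this sketch both methods share the update rule and differ only in the target: SARSA uses the value of the next action the agent really takes, while Q-learning uses the maximum over next actions regardless of what the agent does.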